

Search for: All records

Creators/Authors contains: "David, Maude M."


  1. Large-scale microbiome studies investigating disease-inducing microbial roles base their findings on differences between microbial count data in contrasting environments (e.g., stool samples between cases and controls). These microbiome survey studies are often impeded by small sample sizes and database bias. Combining data from multiple survey studies often results in obvious batch effects, even when DNA preparation and sequencing methods are identical. Relatedly, predictive models trained on one microbial DNA dataset often do not generalize to outside datasets. In this study, we address these limitations by applying word embedding algorithms (GloVe) and PCA transformation to ASV data from the American Gut Project and generating translation matrices that can be applied to any 16S rRNA V4 region gut microbiome sequencing study. Because these approaches contextualize microbial occurrences in a larger dataset while reducing the dimensionality of the feature space, they can improve the generalization of models that predict host phenotype from stool-associated gut microbiota. The GMEmbeddings R package contains GloVe and PCA embedding transformation matrices at 50, 100, and 250 dimensions, each learned using ∼15,000 samples from the American Gut Project. It currently supports the alignment, matching, and matrix multiplication steps that let users transform their V4 16S rRNA data into these embedding spaces. We show how to correlate the properties in the new embedding space to KEGG functional pathways for biological interpretation of results. Lastly, we provide benchmarking on six gut microbiome datasets describing three phenotypes to demonstrate the ability of embedding-based microbiome classifiers to generalize to independent datasets. Future iterations of GMEmbeddings will include embedding transformation matrices for other biological systems. Available at: https://github.com/MaudeDavidLab/GMEmbeddings.
  2. Kinkel, Linda (Ed.)
    ABSTRACT A growing body of research has established that the microbiome can mediate the dynamics and functional capacities of diverse biological systems. Yet, we understand little about what governs the response of these microbial communities to host or environmental changes. Most efforts to model microbiomes focus on defining the relationships between the microbiome, host, and environmental features within a specified study system and therefore fail to capture those that may be evident across multiple systems. In parallel with these developments in microbiome research, computer scientists have developed a variety of machine learning tools that can identify subtle, but informative, patterns from complex data. Here, we recommend using deep transfer learning to resolve microbiome patterns that transcend study systems. By leveraging diverse public data sets in an unsupervised way, such models can learn contextual relationships between features and build on those patterns to perform subsequent tasks (e.g., classification) within specific biological contexts. 
  3. Abstract

    Autism Spectrum Disorder (ASD) is a complex neurodevelopmental disorder influenced by both genetic and environmental factors. Recently, gut dysbiosis has emerged as a powerful contributor to ASD symptoms. In this study, we recruited over 100 age-matched sibling pairs (between 2 and 8 years old) where one had an ASD diagnosis and the other was developing typically (TD) (432 samples total). We collected stool samples over four weeks, tracked over 100 lifestyle and dietary variables, and surveyed behavior measures related to ASD symptoms. We identified 117 amplicon sequencing variants (ASVs) that were significantly different in abundance between sibling pairs across all three timepoints, 11 of which were supported by at least two contrast methods. We additionally identified dietary and lifestyle variables that differ significantly between cohorts, and further linked those variables to the ASVs they statistically relate to. Overall, dietary and lifestyle features were explanatory of ASD phenotype using logistic regression; however, global compositional microbiome features were not. Leveraging our longitudinal behavior questionnaires, we additionally identified 11 ASVs associated with changes in reported anxiety over time within and across all individuals. Lastly, we find that overall microbiome composition (beta-diversity) is associated with specific ASD-related behavioral characteristics.

     
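The first record above describes transforming a study's ASV count table into an embedding space by matching ASVs and then multiplying by a transformation matrix. The sketch below illustrates that general idea in Python under stated assumptions. It is not the GMEmbeddings API (that package is written in R); the function name `embed_samples`, the ASV identifiers, and all numeric values are hypothetical. It also assumes ASVs have already been matched by exact identifier, whereas the abstract mentions a sequence alignment step that is not reproduced here.

```python
def embed_samples(counts, asv_ids, embedding, embedding_ids):
    """Project a samples-by-ASV count table into an embedding space.

    counts        : list of rows, one per sample, one column per ASV
    asv_ids       : ASV identifier for each column of `counts`
    embedding     : list of rows, one per embedding ASV, one column per dimension
    embedding_ids : ASV identifier for each row of `embedding`
    """
    # Matching step: keep only ASVs present in both the study and the embedding.
    row_of = {asv: i for i, asv in enumerate(embedding_ids)}
    shared = [(j, row_of[asv]) for j, asv in enumerate(asv_ids) if asv in row_of]

    # Matrix multiplication step: (samples x shared ASVs) @ (shared ASVs x dims).
    n_dims = len(embedding[0])
    embedded = []
    for sample in counts:
        vec = [0.0] * n_dims
        for col, row in shared:
            for d in range(n_dims):
                vec[d] += sample[col] * embedding[row][d]
        embedded.append(vec)
    return embedded


# Toy data (all values invented): 2 samples, 3 study ASVs, 2 embedding dimensions.
asv_ids = ["asv_b", "asv_x", "asv_a"]        # "asv_x" is absent from the embedding
counts = [[10, 7, 5],
          [0, 2, 1]]
embedding_ids = ["asv_a", "asv_b", "asv_c"]
embedding = [[0.5, -1.0],
             [1.0, 0.0],
             [0.0, 2.0]]

embedded = embed_samples(counts, asv_ids, embedding, embedding_ids)
```

Each row of `embedded` is one sample expressed in the lower-dimensional space; ASVs with no match in the embedding (here `asv_x`) simply do not contribute. A downstream classifier would then be trained on these rows rather than on the raw counts.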